Empowering ELAN with N-gram Analytics for Corpora
Abstract
American Sign Language (ASL), the preferred language of the Deaf community in the USA, is a language in its own right, complete with a rich collection of grammatical features. The DePaul University team has been working on an automatic English/ASL translator, implemented as a 3D avatar, to facilitate better communication between hearing and deaf people. The avatar's animations suffered from various timing inconsistencies and were awkward in appearance. This project attempts to address these problems by using corpus analysis to discern subtle features of ASL and thereby improve the avatar's coarticulation model.
Similar resources
Expanding n-gram analytics in ELAN and a case study for sign synthesis
A new extension to ELAN offers expanded n-gram analysis tools including improved search capabilities and an extensive library of statistical measures of association for n-grams. This paper presents an overview of the new tools and a case study in American Sign Language synthesis that exploits these capabilities for computing more natural timing in generated sentences. The new extension provides...
Using N-Gram Analytics to Improve Automatic Fingerspelling Generation
Fingerspelling recognition is one of the last skills acquired, due to the complex nature of fingerspelling and a lack of display technology that is sufficiently natural for recognition practice. This paper describes a corpus-based study utilizing an n-gram extension to ELAN to gain a deeper understanding of deletion and coarticulation in fingerspelling. The analysis shows that coarticulation an...
Evaluation of a Stack Decoder on a Japanese Newspaper Dictation Task
This paper describes the evaluation of the 「のぞみ」 stack decoder for LVCSR on a 5000-word Japanese newspaper dictation task [3]. Using continuous density acoustic models with 2000 and 3000 states trained on the JNAS/ASJ corpora and a 3-gram LM trained on the RWC text corpus, both models provided by the IPA group, it was possible to reach more than 95% word accuracy on the standard test set. ...
Spock - a Spoken Corpus Client
Spock is an open source tool for the easy deployment of time-aligned corpora. It is fully web-based, and has very limited server-side requirements. It allows the end-user to search the corpus in a text-driven manner, obtaining both the transcription and the corresponding sound fragment in the result page. Spock has an administration environment to help manage the sound files and their respectiv...
Normalising the IJS-ELAN Slovene-English Parallel Corpus for the Extraction of Multilingual Terminology
Various efforts have been made for the development of tools and methods dedicated to the automatic processing of multilingual terminology databases. For that purpose, multilingual parallel corpora have been used as a basis resource. However, most of the neologisms in technical and scientific domains are realised by multiword terms that are rarely identified in parallel corpora. In this paper, w...
Publication date: 2013